Supporting Parallel Applications on Clusters of Workstations: The Intelligent Network Interface Approach
Authors
Abstract
This paper presents a novel networking architecture designed for communication-intensive parallel applications running on clusters of workstations (COWs) connected by high-speed networks. This architecture permits (1) the transfer of selected communication-related functionality from the host machine to the network interface coprocessor, and (2) the exposure of this functionality directly to applications as instructions of a Virtual Communication Machine (VCM) implemented by the coprocessor. User-level code interacts directly with the network coprocessor: the host kernel only 'connects' the application to the VCM and does not participate in the data transfers. The distinctive feature of our design is its flexibility: the integration of the network with the application can be varied to maximize performance. The resulting communication architecture is characterized by a very low overhead on the host processor, by latency and bandwidth close to the hardware limits, and by an application interface that enables zero-copy messaging and eases the porting of some shared-memory parallel applications to COWs. The architecture admits low-cost implementations based only on off-the-shelf hardware components. Additionally, its current ATM-based implementation can be used to communicate with any ATM-enabled host. We use three applications to demonstrate the high performance of the architecture's current implementation: (1) a synthetic client/server application, (2) a parallel implementation of the Traveling Salesman Problem, and (3) a parallel engine for discrete event simulation (GTW). The distributed- and shared-memory versions of these applications have comparable performance. Furthermore, the VCM-based approach to creating a clustered, parallel machine is shown to scale well in terms of matching required with offered communication bandwidths.
Partially funded by DARPA grant DABT63-95-C-0125 and by NSF Grants CDA-9501637, CDA-9422033, ECS-9411846, and MIP-94085550.
Similar resources
Supporting Dynamic Space-sharing on Clusters of Non-dedicated Workstations
Clusters of workstations are increasingly being viewed as a cost-effective alternative to parallel supercomputers. However, resource management and scheduling on workstation clusters is complicated by the fact that the number of idle workstations available for executing parallel applications is constantly fluctuating. In this paper, we present a case for scheduling parallel applications on non...
An Intelligent Computer Interface Utilizing Parallel Picocontrollers (TECHNICAL NOTE)
The design of an interface unit is described, in which RS232 serial data is converted to latched parallel data on 22 independent lines. The data direction of each line is programmable through the serial port. Two picocontrollers are employed in a parallel processing mode to give the required number of I/O pins, and data on the shared serial line is coded to separate data streams to the individu...
Impact of Latency on Applications' Performance
This paper investigates the impact of point-to-point latency on applications' performance on clusters of workstations interconnected with high-speed networks. At present, clusters are often evaluated through comparison of point-to-point latency and bandwidth obtained by ping-pong tests. This paper shows that this approach to performance evaluation of clusters has limited validity and that latenc...
A Hybrid Neural Network Approach for Kinematic Modeling of a Novel 6-UPS Parallel Human-Like Mastication Robot
Introduction: We aimed to introduce a 6-universal-prismatic-spherical (UPS) parallel mechanism for human jaw motion and to theoretically evaluate its kinematic problem. We proposed a strategy to provide a fast and accurate solution to the kinematic problem. The proposed strategy could accelerate the process of solution-finding for the direct kinematic problem by reducing the number of required ...
A Slowdown Model for Applications Executing on Time-Shared Clusters of Workstations
Distributed applications executing on clustered environments typically share resources (computers and network links) with other applications. In such systems, application execution may be retarded by the competition for these shared resources. In this paper, we define a model that calculates the slowdown imposed on applications in time-shared multi-user clusters. Our model focuses on three kin...